Lecture 17 - Introduction to SQL
30 October, 2024
postgres user (you can use postgres as the password)
- There’s also a tutorial on our website
localhost5432postgrespostgrespostgres with the right mouse button, then click on Connect Serverpostgres databaseDatabases, then postgres below it. After this, click with the right mouse button and go to Query ToolTABLE called drivers,) that contain the column_name and column_type, respectivelydriver_id is an int, the driver_name is a varchar with a maximum of 30 characters, the nationality is a varchar with a maximum of 15 characters, and the victories is an intdriver_id is the primary key, which means it is unique for each driverExecute (▶️) button to run the codeINSERT INTO commandVALUES to the bottom of the table;)INSERT INTO drivers VALUES (1, 'Lewis Hamilton','British', 103);
INSERT INTO drivers VALUES (4, 'Fernando Alonso', 'Spanish', 32);
INSERT INTO drivers VALUES (3, 'Sebastian Vettel', 'German', 91);
INSERT INTO drivers VALUES (2, 'Michael Schumacher', 'German', 53);Execute (▶️) button to run the codeData Output tab (bottom left) to see the results… but it’s empty! 🧐:::
Create a table called students with the following columns:
student_id (int)student_name (varchar, 30 characters)major (varchar, 30 characters)gpa (float)Add the following rows to the table:
student_id: 1, student_name: ‘John Doe’, major: ‘Computer Science’, gpa: 3.5
student_id: 2, student_name: ‘Jane Doe’, major: ‘Mathematics’, gpa: 3.8
Query the table to see the results
SQL Notebook icon on the left side of the screenNew SQL Connectionlocalhost (or any name you want)postgreslocalhost5432postgrespostgresCreate.sql with SQL Notebookexample.sql file from our websiteOpen With menu option (right button)SUM() or AVG()AS is an alias which assigns a name to the aggregate statisticSELECT SUM(victories) AS sum_victories,
COUNT(*) AS num_rows,
AVG(victories) AS mean_victories,
MIN(victories) AS min_victories,
MAX(victories) AS max_victories
FROM drivers;SELECT command to get the average victories (a subquery)WHERE clause. See the reason why here. More about subqueries here.HAVING)GROUP BY to group data by a columnSELECT nationality,
SUM(victories) AS sum_victories,
AVG(victories) AS mean_victories,
MIN(victories) AS min_victories,
MAX(victories) AS max_victories
FROM drivers
GROUP BY nationality;ROUND() to round the valuesSELECT nationality,
SUM(victories) AS sum_victories,
ROUND(AVG(victories), 1) AS mean_victories,
MIN(victories) AS min_victories,
MAX(victories) AS max_victories
FROM drivers
GROUP BY nationality;HAVING to filter variables after aggregatingWHERE, which is used to filter variables before aggregatingHAVING| SQL Command | Pandas Equivalent |
|---|---|
SELECT |
df[['column1', 'column2']] |
INSERT INTO |
df.loc[new_index] = new_row or df = df.append(new_row, ignore_index=True) |
WHERE |
df[df['column'] == value] |
GROUP BY |
df.groupby('column') |
HAVING |
df.groupby('column').filter(lambda x: condition) |
ROUND() |
df['column'].round(decimals) |
| Aggregate Functions | df.groupby('column').agg({'col1': 'sum', 'col2': 'mean', 'col3': 'min', 'col4': 'max'}) |
students table and add the rowsExecute (▶️) button to run the code